Optimizing the Performance of Wintel Applications
Abstract
The best medicine for most performance problems is prevention. Despite advances in software performance engineering [1,2], developing complex computer programs that are both functionally correct and efficient remains a difficult and time-consuming task. This paper looks specifically at tuning Windows NT applications running on Intel hardware, from the perspective of optimizing processor cycles and resource usage. Fine-tuning the execution path of code remains one of the fundamental disciplines of performance engineering. To bring this topic into focus, I describe a case study in which an application designed and developed specifically for the Microsoft Windows NT environment is subjected to a rigorous performance analysis using several commercially available CPU execution profiling tools. Because one of the development tools used to optimize the application requires an understanding of the internal workings of Intel processors, the paper also makes an excursion into Intel processor hardware performance.

An application tuning case study

The application that is the target of this analysis is a C language program written to collect Windows NT performance data continuously on an interval basis. Because the application is designed primarily for use as a performance tool, it is very important that it run efficiently: a tool designed to diagnose performance problems should not itself cause performance problems. Moreover, the customers for this application, many of whom are experienced Windows NT performance analysts, are a very demanding group of users.

Figure 1 illustrates the application structure and flow, which is straightforward. Following initialization, the program enters a continuous data collection loop. Inside this loop, Windows NT services are called to retrieve selected performance data across a well-documented Win32 interface. The program, which consists of a single executable module called dmperfss.exe, simply gathers performance statistics across this interface and logs the information collected to a local disk file. The optimization of the code within this inner loop was the focus of this study.
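The structure described above (initialize, then repeatedly collect and log on an interval) can be sketched in portable C. This is only an illustrative sketch, not the code of the program studied: the names Sample, CollectFn, WaitFn, run_collection_loop and stub_collect are hypothetical, the collection callback stands in for the Win32 performance data interface, and the single-value record and the CSV log layout are assumptions.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical sample record; the real collector gathers many counters. */
typedef struct {
    time_t timestamp;   /* when the sample was taken */
    double value;       /* one illustrative counter value */
} Sample;

/* Hypothetical callbacks: one stands in for the Win32 performance data
 * interface, the other blocks until the next collection interval. */
typedef int  (*CollectFn)(Sample *out);   /* returns 0 on success */
typedef void (*WaitFn)(void);

/* Run `intervals` collection passes, appending one line per successful
 * sample to `log_path`. Returns the number of samples logged, or -1 if
 * the log file cannot be opened. Passing NULL for wait_interval skips
 * the wait between passes. */
int run_collection_loop(const char *log_path, CollectFn collect,
                        WaitFn wait_interval, int intervals)
{
    FILE *log = fopen(log_path, "a");
    if (log == NULL) {
        return -1;
    }

    int logged = 0;
    for (int i = 0; i < intervals; i++) {
        Sample s;
        if (collect(&s) == 0) {
            /* Assumed log layout: timestamp,value */
            fprintf(log, "%ld,%.3f\n", (long)s.timestamp, s.value);
            logged++;
        }
        if (wait_interval != NULL) {
            wait_interval();
        }
    }

    fclose(log);
    return logged;
}

/* Stub collector for illustration only: reports a running call count. */
static int stub_collect(Sample *out)
{
    static int calls = 0;
    out->timestamp = time(NULL);
    out->value = (double)++calls;
    return 0;
}
```

A call such as run_collection_loop("perf.log", stub_collect, NULL, 3) appends three lines to perf.log and returns 3. Everything inside the for loop runs on every interval, which is why the inner loop is the natural target for execution profiling.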
Related resources
Improving Growth and Performance of Young Almond Trees in Nursery by Optimizing Mineral Nutrition
Short growing season restricts production of standard-sized fruit trees in nurseries at cold regions. Enhancing plant growth by optimizing program of mineral nutrition may solve the problem. This study evaluated efficiency of fertilizers [urea, sulfur coated urea (SCU), or foliar applications of a NPK compound fertilizer] for optimizing the growth of seedling rootstocks and grafted young almond...
Scalable Distributed Data Structures for High-Performance Databases
Present databases, whether on centralized or parallel DBMSs, do not deal well with scalability. We present an architecture for Wintel multicomputers termed AMOS-SDDS, coupling a high-performance main-memory DBMS AMOS-II and a manager of Scalable Distributed Data Structures SDDS-2000. SDDS-2000 provides the scalable data partitioning in distributed RAM, supporting parallel scans with function sh...
IP-Related Refusals to Deal. Part 1: Updating the Intel-Intergraph Controversy
Considerable attention has focused on US District Court Judge Jackson’s preliminary fact findings about the Microsoft half of the Wintel monopoly. But the November 5, 1999, decision of the US Court of Appeals for the Federal Circuit, exonerating allegedly monopolistic conduct of the Intel half of Wintel-dom, has gone almost unremarked. This month’s Micro Law updates the July-August 1998 Micro L...
Efficient Personal Supercomputing in Fortran 9x on CPU-GPU Systems
The availability of graphics-processors based compute devices and multi-core host architectures with larger memories on both means that it is possible to run relatively large scientific computing problems on “personal” machines. For wide adoption by scientists and to achieve an increase in their productivity these architectures must be relatively easy to use in the languages scientists use (amo...
An Analytical FPGA Performance Model for Reconfigurable Computing
Optimizing FPGA architectures is one of the key challenges in digital design flow. Traditionally, FPGA designers make use of CAD tools for evaluating architectures in terms of the area, delay and power. Recently, analytical methods have been proposed to optimize the architectures faster and easier. A complete analytical power, area and delay model have received little attention to date. In addi...